2,775 research outputs found

    Interactive exploration of population scale pharmacoepidemiology datasets

    Population-scale drug prescription data linked with adverse drug reaction (ADR) data supports the fitting of models large enough to detect drug-use and ADR patterns that are not detectable using traditional methods on smaller datasets. However, detecting ADR patterns in large datasets requires tools for scalable data processing, machine learning for data analysis, and interactive visualization. To our knowledge, no existing pharmacoepidemiology tool supports all three requirements. We have therefore created a tool for interactive exploration of patterns in prescription datasets with millions of samples. We use Spark to preprocess the data for machine learning and for analyses using SQL queries. We have implemented models in Keras and the scikit-learn framework. The model results are visualized and interpreted using live Python coding in Jupyter. We apply our tool to explore a dataset of 384 million prescriptions from the Norwegian Prescription Database, combined with 62 million prescriptions for elderly patients who were hospitalized. We preprocess the data in two minutes, train models in seconds, and plot the results in milliseconds. Our results show the power of combining computational power, short computation times, and ease of use for the analysis of population-scale pharmacoepidemiology datasets. The code is open source and available at: https://github.com/uit-hdl/norpd_prescription_analyse
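The kind of pattern detection the abstract describes can be illustrated with a minimal, stdlib-only sketch of drug–ADR co-occurrence counting; the record layout, names, and data are hypothetical, and the actual tool uses Spark, Keras, and scikit-learn rather than plain Python.

```python
from collections import Counter

# Hypothetical prescription records: (patient_id, drug, adr_observed).
# A real population-scale dataset would hold hundreds of millions of rows.
records = [
    ("p1", "warfarin", True),
    ("p1", "aspirin", True),
    ("p2", "warfarin", False),
    ("p2", "aspirin", True),
    ("p3", "warfarin", True),
]

def adr_rate_by_drug(records):
    """Fraction of each drug's prescriptions that co-occur with an ADR."""
    total, with_adr = Counter(), Counter()
    for _, drug, adr in records:
        total[drug] += 1
        if adr:
            with_adr[drug] += 1
    return {drug: with_adr[drug] / total[drug] for drug in total}

rates = adr_rate_by_drug(records)
```

In the tool itself this aggregation step would be expressed as a Spark SQL query so it scales across the full dataset.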

    Collaborative Privacy Policy Authoring in a Social Networking Context.

    Recent years have seen a significant increase in the popularity of social networking services. These online services enable users to construct groups of contacts, referred to as friends, with which they can share digital content and communicate. This sharing is actively encouraged by the social networking services, with users' privacy often seen as a secondary concern. In this paper we first propose a privacy-aware social networking service and then introduce a collaborative approach to authoring privacy policies for the service. In addressing user privacy, our approach takes into account the needs of all parties affected by the disclosure of information and digital content. © 2010 Crown

    MetaboLab - advanced NMR data processing and analysis for metabolomics

    Background: Despite widespread use of Nuclear Magnetic Resonance (NMR) in metabolomics for the analysis of biological samples, there is a lack of graphically driven, publicly available software to process large one- and two-dimensional NMR data sets for statistical analysis.

    Results: Here we present MetaboLab, a MATLAB-based software package that facilitates NMR data processing by providing automated algorithms for processing series of spectra in a reproducible fashion. A graphical user interface provides easy access to all steps of data processing via a script builder that generates MATLAB scripts, with an option to alter the code manually. The analysis of two-dimensional spectra (1H,13C-HSQC spectra) is facilitated by the use of a spectral library derived from publicly available databases, which can be extended readily. The software allows the user to display specific metabolites in small regions of interest where signals can be picked. To facilitate the analysis of series of two-dimensional spectra, different spectra can be overlaid and assignments can be transferred between spectra. The software includes mechanisms to account for overlapping signals by highlighting neighboring and ambiguous assignments.

    Conclusions: The MetaboLab software is an integrated software package for NMR data processing and analysis, closely linked to the previously developed NMRLab software. It includes tools for batch processing and gives access to a wealth of algorithms available in the MATLAB framework. Algorithms within MetaboLab help to optimize the flow of metabolomics data preparation for statistical analysis. The combination of an intuitive graphical user interface with advanced data processing algorithms facilitates the use of MetaboLab in a broader metabolomics context.

    How and Why is An Answer (Still) Correct? Maintaining Provenance in Dynamic Knowledge Graphs

    Knowledge graphs (KGs) have increasingly become the backbone of many critical knowledge-centric applications. Most large-scale KGs used in practice are automatically constructed based on an ensemble of extraction techniques applied over diverse data sources. Therefore, it is important to establish the provenance of results for a query to determine how these were computed. Provenance has been shown to be useful for assigning confidence scores to the results, for debugging the KG generation itself, and for providing answer explanations. In many such applications, certain queries are registered as standing queries since their answers are needed often. However, KGs keep continuously changing due to reasons such as changes in the source data, improvements to the extraction techniques, refinement/enrichment of information, and so on. This brings us to the issue of efficiently maintaining the provenance polynomials of complex graph pattern queries for dynamic and large KGs, instead of having to recompute them from scratch each time the KG is updated. Addressing these issues, we present HUKA, which uses provenance polynomials for tracking the derivation of query results over knowledge graphs by encoding the edges involved in generating the answer. More importantly, HUKA also maintains these provenance polynomials in the face of updates (insertions as well as deletions of facts) to the underlying KG. Experimental results over large real-world KGs such as YAGO and DBpedia with various benchmark SPARQL query workloads reveal that HUKA can be almost 50 times faster than existing systems for provenance computation on dynamic KGs.
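The idea of maintaining provenance under deletions can be sketched with a toy sum-of-products representation: each answer carries a set of monomials, each monomial being the set of edge ids whose conjunction derives that answer. This is an illustrative assumption about the encoding, not HUKA's actual implementation.

```python
# Toy provenance polynomials: answer -> set of monomials, where each
# monomial is a frozenset of KG edge ids jointly deriving the answer.
# E.g. "answer1" has provenance e1*e2 + e3 (two independent derivations).
prov = {
    "answer1": {frozenset({"e1", "e2"}), frozenset({"e3"})},
}

def delete_edge(prov, edge):
    """On deletion of a fact (edge), drop every derivation that used it,
    instead of recomputing provenance from scratch."""
    return {
        answer: {m for m in monomials if edge not in m}
        for answer, monomials in prov.items()
    }

# Deleting e2 invalidates the derivation e1*e2; answer1 survives via e3.
updated = delete_edge(prov, "e2")
```

An answer whose monomial set becomes empty no longer holds in the updated KG, which is how a standing query's result set can be refreshed incrementally.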

    Simulation of primordial object formation

    We have included the chemical rate network responsible for the formation of molecular hydrogen in the N-body hydrodynamic code, Hydra, in order to study the formation of the first cosmological objects at redshifts between 10 and 50. We have tested our implementation of the chemical and cooling processes by comparing N-body top-hat simulations with theoretical predictions from a semi-analytic model and found them to be in good agreement. We find that post-virialization properties are insensitive to the initial abundance of molecular hydrogen. Our main objective was to determine the minimum mass, M_SG(z), of perturbations that could become self-gravitating (a prerequisite for star formation), and the redshift at which this occurred. We have developed a robust indicator for detecting the presence of a self-gravitating cloud in our simulations and find that we can do so with a baryonic particle mass resolution of 40 solar masses. We have performed cosmological simulations of primordial objects and find that the masses and redshifts at which the objects become self-gravitating agree well with the M_SG(z) results from the top-hat simulations. Once a critical molecular hydrogen fractional abundance of about 0.0005 has formed in an object, the cooling time drops below the dynamical time at the centre of the cloud and the gas free-falls in the dark matter potential wells, becoming self-gravitating a dynamical time later. Comment: 45 pages, 17 figures, submitted to Ap
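The collapse criterion in the abstract, that a cloud becomes self-gravitating once its cooling time drops below its dynamical time, can be written as a short numerical check. The free-fall expression for the dynamical time is a standard textbook form assumed here; the paper's indicator is more elaborate.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def dynamical_time(rho):
    """Free-fall (dynamical) time, in seconds, of a cloud of mean
    density rho [kg/m^3]: t_dyn = sqrt(3*pi / (32*G*rho))."""
    return math.sqrt(3.0 * math.pi / (32.0 * G * rho))

def can_collapse(t_cool, rho):
    """Criterion from the abstract: the gas free-falls once the
    cooling time is shorter than the dynamical time."""
    return t_cool < dynamical_time(rho)
```

For a cloud with rho of order 1e-18 kg/m^3, the dynamical time is of order 1e13–1e14 s, so cooling times below that threshold permit collapse under this criterion.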

    CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data

    CS23D (chemical shift to 3D structure) is a web server for rapidly generating accurate 3D protein structures using only assigned nuclear magnetic resonance (NMR) chemical shifts and sequence data as input. Unlike conventional NMR methods, CS23D requires no NOE and/or J-coupling data to perform its calculations. CS23D accepts chemical shift files in either SHIFTY or BMRB format, and produces a set of PDB coordinates for the protein in about 10–15 min. CS23D uses a pipeline of several preexisting programs or servers to calculate the actual protein structure. Depending on the sequence similarity (or lack thereof), CS23D uses either (i) maximal subfragment assembly (a form of homology modeling), (ii) chemical shift threading or (iii) shift-aided de novo structure prediction (via Rosetta), followed by chemical shift refinement to generate and/or refine protein coordinates. Tests conducted on more than 100 proteins from the BioMagResBank indicate that CS23D converges (i.e. finds a solution) for >95% of protein queries. These chemical-shift-generated structures were found to be within 0.2–2.8 Å RMSD of the NMR structures generated using conventional NOE-based NMR methods or conventional X-ray methods. The performance of CS23D depends on the completeness of the chemical shift assignments and the similarity of the query protein to known 3D folds. CS23D is accessible at http://www.cs23d.ca
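The three-tier strategy described above, where the modeling method is chosen by sequence similarity to known folds, amounts to a simple dispatch. The sketch below illustrates that control flow only; the similarity thresholds are invented for illustration and are not CS23D's actual cutoffs.

```python
def choose_method(seq_identity):
    """Dispatch a query to a modeling tier by sequence identity to the
    closest known fold (thresholds are hypothetical placeholders)."""
    if seq_identity >= 0.30:
        # Close homolog available: reuse and stitch known fragments.
        return "maximal subfragment assembly"
    if seq_identity >= 0.15:
        # Remote similarity: thread the sequence through candidate folds,
        # scored with the chemical shifts.
        return "chemical shift threading"
    # No usable template: shift-aided de novo prediction.
    return "shift-aided de novo (Rosetta)"
```

Whichever tier is chosen, the resulting coordinates are then refined against the input chemical shifts.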

    SMPDB: The Small Molecule Pathway Database

    The Small Molecule Pathway Database (SMPDB) is an interactive, visual database containing more than 350 small-molecule pathways found in humans. More than two thirds of these pathways (>280) are not found in any other pathway database. SMPDB is designed specifically to support pathway elucidation and pathway discovery in clinical metabolomics, transcriptomics, proteomics and systems biology. SMPDB provides exquisitely detailed, hyperlinked diagrams of human metabolic pathways, metabolic disease pathways, metabolite signaling pathways and drug-action pathways. All SMPDB pathways include information on the relevant organs, organelles, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures. Each small molecule is hyperlinked to detailed descriptions contained in the Human Metabolome Database (HMDB) or DrugBank, and each protein or enzyme complex is hyperlinked to UniProt. All SMPDB pathways are accompanied by detailed descriptions, providing an overview of the pathway, condition or processes depicted in each diagram. The database is easily browsed and supports full-text searching. Users may query SMPDB with lists of metabolite names, drug names, gene/protein names, SwissProt IDs, GenBank IDs, Affymetrix IDs or Agilent microarray IDs. These queries will produce lists of matching pathways and highlight the matching molecules on each of the pathway diagrams. Gene, metabolite and protein concentration data can also be visualized through SMPDB's mapping interface. All of SMPDB's images, image maps, descriptions and tables are downloadable. SMPDB is available at: http://www.smpdb.ca